AITopics | input resolution

Dynamic Resolution Network

Neural Information Processing Systems | Apr-27-2026, 13:28:22 GMT

Deep convolutional neural networks (CNNs) often have sophisticated designs with numerous learnable parameters in order to achieve high accuracy. To reduce the cost of deploying them on mobile devices, recent work has made considerable effort to uncover redundancy in pre-defined architectures. Nevertheless, redundancy in the input resolution of modern CNNs has not been fully investigated, i.e., the resolution of the input image is fixed. In this paper, we observe that the smallest resolution at which a given network predicts an image accurately differs from image to image. We therefore propose a novel dynamic-resolution network (DRNet) in which the input resolution is determined dynamically for each input sample. A resolution predictor with negligible computational cost is explored and optimized jointly with the desired network.

artificial intelligence, machine learning, resolution, (15 more...)

Neural Information Processing Systems

Country: North America > Canada (0.28)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

MCUNetV2: Memory-Efficient Patch-based Inference for Tiny Deep Learning

Neural Information Processing Systems | Apr-24-2026, 19:12:26 GMT

Tiny deep learning on microcontroller units (MCUs) is challenging due to the limited memory size. We find that the memory bottleneck is due to the imbalanced memory distribution in convolutional neural network (CNN) designs: the first several blocks have an order of magnitude larger memory usage than the rest of the network. To alleviate this issue, we propose a generic patch-by-patch inference scheduling, which operates only on a small spatial region of the feature map and significantly cuts down the peak memory. However, a naive implementation introduces overlapping patches and computation overhead. We further propose receptive field redistribution to shift the receptive field and FLOPs to the later stage and reduce the computation overhead. Manually redistributing the receptive field is difficult.
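
As a rough illustration of the patch-by-patch idea described above, here is a minimal pure-Python sketch. It is a 1-D simplification and not the MCUNetV2 scheduler: the function names, the `patch` parameter, and the toy signal and kernel are all assumptions made for illustration. It shows that computing a convolution tile by tile gives the same output as computing it in one pass, provided each tile also reads a small halo of neighbouring inputs, which is where overlapping patches and extra work come from.

```python
def conv1d_valid(x, k):
    """Plain 'valid' 1-D convolution (cross-correlation) over the whole input."""
    n_out = len(x) - len(k) + 1
    return [sum(k[j] * x[i + j] for j in range(len(k))) for i in range(n_out)]


def conv1d_by_patches(x, k, patch=4):
    """Compute the same output one tile at a time (hypothetical helper).

    Each tile needs len(k) - 1 extra input samples (a halo) beyond its own
    outputs, so neighbouring tiles read overlapping inputs.
    """
    n_out = len(x) - len(k) + 1
    out, inputs_read = [], 0
    for start in range(0, n_out, patch):
        end = min(start + patch, n_out)
        tile = x[start:end + len(k) - 1]   # tile inputs, halo included
        inputs_read += len(tile)           # count reads to expose the overlap
        out.extend(conv1d_valid(tile, k))  # only this tile is processed at once
    return out, inputs_read


signal = list(range(12))   # toy input, assumed for illustration
kernel = [1, 0, -1]        # toy kernel, assumed for illustration
full = conv1d_valid(signal, kernel)
tiled, inputs_read = conv1d_by_patches(signal, kernel, patch=4)
```

In this sketch `tiled` equals `full`, while `inputs_read` is larger than `len(signal)` because halo samples are read by more than one tile.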

artificial intelligence, deep learning, machine learning, (15 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

fbb10d319d44f8c3b4720873e4177c65-Supplemental-Conference.pdf

Neural Information Processing Systems | Feb-13-2026, 01:35:38 GMT

dataset, val and test, vitpose, (13 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.48)

e56954b4f6347e897f954495eab16a88-Paper.pdf

Neural Information Processing Systems | Feb-11-2026, 15:46:33 GMT

neural network, resolution, resolution predictor, (12 more...)

Neural Information Processing Systems

Country:

Asia > Macao (0.14)
North America > United States > California > Los Angeles County > Long Beach (0.04)
North America > Canada > Quebec > Montreal (0.04)
Asia > China > Zhejiang Province (0.04)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

1 Experiment Details

Neural Information Processing Systems | Feb-10-2026, 03:44:45 GMT

Clearly, with only 26M learnable parameters, the performance can be boosted from 79.9 to 84.4

artificial intelligence, machine learning, resolution, (15 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

1371bccec2447b5aa6d96d2a540fb401-Paper.pdf

Neural Information Processing Systems | Feb-7-2026, 13:54:05 GMT

computation overhead, inference, peak memory, (11 more...)

Neural Information Processing Systems

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)
Information Technology > Artificial Intelligence > Cognitive Science (0.71)

Dynamic Resolution Network

Neural Information Processing Systems | Dec-25-2025, 02:52:50 GMT

Deep convolutional neural networks (CNNs) often have sophisticated designs with numerous learnable parameters in order to achieve high accuracy. To reduce the cost of deploying them on mobile devices, recent work has made considerable effort to uncover redundancy in pre-defined architectures. Nevertheless, redundancy in the input resolution of modern CNNs has not been fully investigated, i.e., the resolution of the input image is fixed. In this paper, we observe that the smallest resolution at which a given network predicts an image accurately differs from image to image. We therefore propose a novel dynamic-resolution network (DRNet) in which the input resolution is determined dynamically for each input sample. A resolution predictor with negligible computational cost is explored and optimized jointly with the desired network.

dynamic resolution network, name change, resolution, (10 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.82)

CropVLM: Learning to Zoom for Fine-Grained Vision-Language Perception

Carvalho, Miguel, Dias, Helder, Martins, Bruno

arXiv.org Artificial Intelligence | Nov-26-2025

Vision-Language Models (VLMs) often struggle with tasks that require fine-grained image understanding, such as scene-text recognition or document analysis, due to perception limitations and visual fragmentation. To address these challenges, we introduce CropVLM as an external low-cost method for boosting performance, enabling VLMs to dynamically "zoom in" on relevant image regions, enhancing their ability to capture fine details. CropVLM is trained using reinforcement learning, without using human-labeled bounding boxes as a supervision signal, and without expensive synthetic evaluations. The model is trained once and can be paired with both open-source and proprietary VLMs to improve their performance. Our approach delivers significant improvements on tasks that require high-resolution image understanding, notably for benchmarks that are out-of-domain for the target VLM, without modifying or fine-tuning the VLM, thus avoiding catastrophic forgetting.

large language model, machine learning, resolution, (20 more...)

arXiv.org Artificial Intelligence

2511.1982

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.49)

Impact of Image Resolution on Age Estimation with DeepFace and InsightFace

Jamo, Shiyar

arXiv.org Artificial Intelligence | Nov-19-2025

Automatic age estimation is widely used for age verification, where input images often vary considerably in resolution. This study evaluates the effect of image resolution on age estimation accuracy using DeepFace and InsightFace. A total of 1000 images from the IMDB-Clean dataset were processed in seven resolutions, resulting in 7000 test samples. Performance was evaluated using Mean Absolute Error (MAE), Standard Deviation (SD), and Median Absolute Error (MedAE). Based on this study, we conclude that input image resolution has a clear and consistent impact on the accuracy of age estimation in both DeepFace and InsightFace. Both frameworks achieve optimal performance at 224x224 pixels, with an MAE of 10.83 years (DeepFace) and 7.46 years (InsightFace). At low resolutions, MAE increases substantially, while very high resolutions also degrade accuracy. InsightFace is consistently faster than DeepFace across all resolutions.
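
The three metrics named above can be sketched in a few lines of standard-library Python. This is an illustrative sketch, not the study's evaluation code: the function name and the sample ages are made up, and taking SD as the population standard deviation of the absolute errors is an assumption, since the abstract does not say which quantity or which estimator was used.

```python
from statistics import mean, median, pstdev


def age_error_metrics(true_ages, predicted_ages):
    """Return MAE, SD, and MedAE for paired age lists (hypothetical helper)."""
    abs_errors = [abs(t - p) for t, p in zip(true_ages, predicted_ages)]
    return {
        "mae": mean(abs_errors),      # Mean Absolute Error
        "sd": pstdev(abs_errors),     # assumed: population SD of absolute errors
        "medae": median(abs_errors),  # Median Absolute Error
    }


# Made-up ages in years, for illustration only.
metrics = age_error_metrics([20, 30, 40, 50], [22, 27, 45, 50])
```

MedAE is less sensitive than MAE to a few very large misses, which is one reason to report both.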

artificial intelligence, machine learning, resolution, (17 more...)

arXiv.org Artificial Intelligence

2511.14689

Country: Europe > Netherlands (0.14)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

CARES: Context-Aware Resolution Selector for VLMs

Kimhi, Moshe, Shabtay, Nimrod, Giryes, Raja, Baskin, Chaim, Schwartz, Eli

arXiv.org Artificial Intelligence | Oct-23-2025

Large vision-language models (VLMs) commonly process images at native or high resolution to remain effective across tasks. This inflates visual tokens, often to 97-99% of total tokens, resulting in high compute and latency even when low-resolution images would suffice. We introduce CARES, a Context-Aware Resolution Selector: a lightweight preprocessing module that, given an image-query pair, predicts the minimal sufficient input resolution. CARES uses a compact VLM (350M) to extract features and predict when a target pretrained VLM's response converges to its peak ability to answer correctly. Though trained as a discrete classifier over a set of optional resolutions, CARES interpolates continuous resolutions at inference for fine-grained control. Across five multimodal benchmarks spanning documents and natural images, as well as diverse target VLMs, CARES preserves task performance while reducing compute by up to 80%.
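
One way a discrete classifier could yield a continuous resolution is to take the probability-weighted average of the candidate resolutions. The sketch below illustrates that reading only; the abstract does not state how CARES interpolates, so the rule itself, the function name, the candidate resolutions, and the probabilities are all assumptions.

```python
def interpolate_resolution(class_probs, candidate_resolutions):
    """Probability-weighted average of candidate resolutions (assumed rule)."""
    if len(class_probs) != len(candidate_resolutions):
        raise ValueError("need one probability per candidate resolution")
    total = sum(class_probs)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    weighted = sum(p * r for p, r in zip(class_probs, candidate_resolutions))
    return weighted / total  # normalise in case the inputs do not sum to 1


# Hypothetical classifier output over three hypothetical resolutions.
resolution = interpolate_resolution([0.2, 0.5, 0.3], [224, 448, 896])
```

Under this rule the result always lies between the smallest and largest candidate, and a confident prediction stays close to the chosen candidate.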

machine learning, natural language, resolution, (17 more...)

arXiv.org Artificial Intelligence

2510.19496

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
